fix: fix gpt oss export + bump mbridge by yuki-97 · Pull Request #2249 · NVIDIA-NeMo/RL

yuki-97 · 2026-04-10T14:24:03Z

Previously, examples/converters/convert_megatron_to_hf.py exported gpt-oss models with an incorrect weight layout. This PR fixes that. See NVIDIA-NeMo/Megatron-Bridge#3271 for more details.
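
As a generic illustration of why a layout error is easy to miss (this is not the Megatron-Bridge export code, and the PR does not state which tensors were affected): when a projection weight happens to be square, loading it transposed passes every shape check yet produces different outputs. The helpers `transpose` and `matvec` and the numbers below are hypothetical.

```python
# Generic illustration only; this is not the actual export code.

def transpose(w):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*w)]


def matvec(w, x):
    """Compute y = W x for W stored as [out_features][in_features]."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


# A square, non-symmetric weight: both layouts have the same shape.
weight = [[1.0, 2.0],
          [0.0, 1.0]]
wrong_layout = transpose(weight)

x = [1.0, 1.0]
y_correct = matvec(weight, x)      # uses the intended layout
y_wrong = matvec(wrong_layout, x)  # same shape, different result

same_shape = (len(weight), len(weight[0])) == (
    len(wrong_layout), len(wrong_layout[0]))
print(same_shape, y_correct, y_wrong)
```

Because a shape comparison alone cannot catch this kind of error, the validation below checks model behavior after a round trip instead.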

Validation steps:

Import the HF checkpoint into Megatron, train for one step, and save the Megatron checkpoint.

NRL_FORCE_REBUILD_VENVS=true \
uv run python examples/run_grpo.py \
    --config examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml \
    grpo.max_num_steps=1 \
    policy.max_total_sequence_length=512 \
    logger.wandb_enabled=false \
    logger.tensorboard_enabled=false \
    checkpointing.enabled=True \
    checkpointing.checkpoint_dir=results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose \
    checkpointing.save_period=1

Convert the saved Megatron checkpoint to HF format.

uv run --extra mcore python examples/converters/convert_megatron_to_hf.py \
    --config results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose/step_1/config.yaml \
    --hf-model-name unsloth/gpt-oss-20b-BF16 \
    --megatron-ckpt-path results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose/step_1/policy/weights/iter_0000000 \
    --hf-ckpt-path results/step_1_hf
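
The converter arguments above all derive from the checkpoint directory and the step number, so they can be assembled programmatically. The following is a minimal sketch; `build_convert_command` is a hypothetical helper, and the directory layout is simply copied from the command above rather than taken from any documented contract.

```python
from pathlib import PurePosixPath


def build_convert_command(checkpoint_dir, step, hf_model_name, hf_ckpt_path,
                          iter_dir="iter_0000000"):
    """Assemble the converter argv for one saved training step.

    The layout (step_<N>/config.yaml and step_<N>/policy/weights/<iter_dir>)
    mirrors the command in this PR. Whether iter_dir differs for other
    steps is an assumption left to the caller.
    """
    step_dir = PurePosixPath(checkpoint_dir) / f"step_{step}"
    return [
        "uv", "run", "--extra", "mcore", "python",
        "examples/converters/convert_megatron_to_hf.py",
        "--config", str(step_dir / "config.yaml"),
        "--hf-model-name", hf_model_name,
        "--megatron-ckpt-path",
        str(step_dir / "policy" / "weights" / iter_dir),
        "--hf-ckpt-path", hf_ckpt_path,
    ]


cmd = build_convert_command(
    checkpoint_dir="results/grpo-gptoss-20b-8n8g-megatron-test-export-transpose",
    step=1,
    hf_model_name="unsloth/gpt-oss-20b-BF16",
    hf_ckpt_path="results/step_1_hf",
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.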

Train again, starting from the converted HF checkpoint.

uv run python examples/run_grpo.py \
    --config examples/configs/recipes/llm/grpo-gptoss-20b-8n8g-megatron.yaml \
    policy.model_name=results/step_1_hf \
    grpo.max_num_steps=1 \
    policy.max_total_sequence_length=512 \
    logger.wandb_enabled=false \
    logger.tensorboard_enabled=false \
    checkpointing.enabled=false

Results of the third validation step (training again from the converted HF checkpoint):
Before this PR:

  • Generation KL Error: 13.0520
  • Avg Reward: 0.0000

After this PR:

  • Generation KL Error: 0.0009
  • Avg Reward: 0.3960
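
For readers unfamiliar with this kind of metric: a generation KL error measures how far the log-probabilities from the generation engine drift from those of the training policy on the same sampled tokens. The exact formula NeMo RL uses is not stated in this PR. The sketch below assumes one common choice, the non-negative "k3" estimator, and both the function name `kl_k3_estimate` and the sample values are hypothetical.

```python
import math


def kl_k3_estimate(gen_logprobs, train_logprobs):
    """Estimate KL(gen || train) from per-token log-probs of sampled tokens.

    ASSUMPTION: tokens were sampled from the generation policy, and the
    estimator is mean(exp(d) - d - 1) with d = train_logprob - gen_logprob.
    This may differ from the metric actually reported by NeMo RL.
    """
    if len(gen_logprobs) != len(train_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    if not gen_logprobs:
        raise ValueError("need at least one token")
    total = 0.0
    for g, t in zip(gen_logprobs, train_logprobs):
        d = t - g
        total += math.exp(d) - d - 1.0
    return total / len(gen_logprobs)


# Made-up values: identical policies versus a badly mismatched one.
gen = [-0.5, -1.2, -0.1]
matched = kl_k3_estimate(gen, gen)
mismatched = kl_k3_estimate(gen, [-6.0, -9.5, -4.0])
print(matched, mismatched)
```

Under this estimator, a value near zero means the two engines agree on the sampled tokens, which is the pattern the "after" numbers show.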

copy-pr-bot · 2026-04-10T14:24:34Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Yuki Huang <yukih@nvidia.com>

yuki-97 · 2026-04-11T13:00:35Z

/ok to test 41fc91d

github-actions · 2026-04-11T13:01:02Z

✅ Submodule Fast-Forward Check Results

Check based on commit: 41fc91d (PR #2249 from yukih/fix-gpt-oss)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)
Megatron-LM: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

yuki-97 mentioned this pull request Apr 10, 2026

[model] fix: fix gpt oss export NVIDIA-NeMo/Megatron-Bridge#3271

Merged

yuki-97 added 4 commits April 11, 2026 02:04

move down_proj handle to vllm

db40f9e

Signed-off-by: Yuki Huang <yukih@nvidia.com>

bump mbridge

6414ed0

Signed-off-by: Yuki Huang <yukih@nvidia.com>

fix bump

689b107

Signed-off-by: Yuki Huang <yukih@nvidia.com>

add comment

41fc91d

Signed-off-by: Yuki Huang <yukih@nvidia.com>

yuki-97 force-pushed the yukih/fix-gpt-oss branch from 6fa2609 to 41fc91d Compare April 11, 2026 12:59

yuki-97 marked this pull request as ready for review April 11, 2026 13:00

yuki-97 requested review from a team as code owners April 11, 2026 13:00

yuki-97 added the CI:L1 Run doctests, unit tests, and functional tests label Apr 11, 2026

copy-pr-bot bot temporarily deployed to nemo-ci April 11, 2026 13:00 Inactive

yuki-97 requested review from cuichenx, terrykong and yaoyu-33 April 11, 2026 13:01

yuki-97 added the r0.6.0 label Apr 11, 2026

yuki-97 changed the title ~~fix: fix gpt oss export~~ fix: fix gpt oss export + bump mbridge Apr 11, 2026

copy-pr-bot bot temporarily deployed to nemo-ci April 11, 2026 14:43 Inactive

copy-pr-bot bot temporarily deployed to nemo-ci April 11, 2026 16:16 Inactive

yuki-97 wants to merge 4 commits into main from yukih/fix-gpt-oss
